Scalable Task-Oriented Parallelism for Structure Based Incomplete LU Factorization

Authors

  • Xin Dong
  • Gene Cooperman
Abstract

ILU(k) is an important preconditioner widely used in linear algebra solvers for sparse matrices. Until now, there has been no highly scalable parallel ILU(k) algorithm. This paper presents the first such scalable algorithm. For example, the new algorithm achieves a 50-fold speedup on 80 nodes for general sparse, diagonally dominant matrices of dimension 160,000. The algorithm assumes that each node has sufficient memory to hold the matrix. The parallelism is task-oriented. We present experimental results for k = 1 and k = 2, the cases most commonly used in practical applications. Results are presented for three platforms: a departmental cluster with Gigabit Ethernet, a high-performance cluster using an InfiniBand interconnect, and a simulation of a Grid computation with two or three participating sites.
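The abstract does not spell out what the level k in ILU(k) means. As a rough illustration only, the sketch below computes the fill pattern using the standard textbook level-of-fill rule: original nonzeros have level 0, and a fill entry at (i, j) created through pivot p gets level lev(i, p) + lev(p, j) + 1. Entries whose level exceeds k are dropped. This is a sequential, dense-array sketch and not the parallel algorithm of the paper; the function name `iluk_levels` and its parameters are hypothetical.

```python
import math


def iluk_levels(pattern, n, max_level):
    """Symbolic ILU(k): return {(i, j): level} for entries kept at level <= max_level.

    pattern   -- set of (row, col) positions of the original nonzeros (assumed input form)
    n         -- matrix dimension
    max_level -- the k in ILU(k)
    """
    inf = math.inf
    # Original nonzeros and the diagonal start at level 0, everything else at infinity.
    lev = [[0 if (i == j or (i, j) in pattern) else inf for j in range(n)]
           for i in range(n)]
    for i in range(1, n):
        for p in range(i):  # pivot columns to the left of the diagonal
            if lev[i][p] > max_level:
                continue  # entry was never created or is too "deep" to be used
            for j in range(p + 1, n):
                candidate = lev[i][p] + lev[p][j] + 1
                if candidate < lev[i][j]:
                    lev[i][j] = candidate
        # Drop everything in row i whose level exceeds max_level.
        for j in range(n):
            if lev[i][j] > max_level:
                lev[i][j] = inf
    return {(i, j): lev[i][j]
            for i in range(n) for j in range(n) if lev[i][j] <= max_level}


# Small 4x4 example: diagonal plus off-diagonal nonzeros at (1,0), (0,3), (2,1).
example_pattern = {(0, 0), (1, 1), (2, 2), (3, 3), (1, 0), (0, 3), (2, 1)}
for k in (0, 1, 2):
    kept = iluk_levels(example_pattern, 4, k)
    fill = {pos: lvl for pos, lvl in kept.items() if lvl > 0}
    print(f"ILU({k}): {len(kept)} entries kept, fill = {fill}")
```

In this small example, ILU(0) keeps only the original pattern, ILU(1) adds one fill entry at (1, 3), and ILU(2) additionally admits the entry at (2, 3), which arises from that earlier fill. Larger k gives a more accurate but denser preconditioner, which is why small values such as k = 1 and k = 2 are common in practice.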


Similar resources

ILUM: A Multi-Elimination ILU Preconditioner for General Sparse Matrices

Standard preconditioning techniques based on incomplete LU (ILU) factorizations offer a limited degree of parallelism, in general. A few of the alternatives advocated so far consist of either using some form of polynomial preconditioning, or applying the usual ILU factorization to a matrix obtained from a multicolor ordering. In this paper we present an incomplete factorization technique based ...


Computing a block incomplete LU preconditioner as the by-product of block left-looking A-biconjugation process

In this paper, we present a block version of incomplete LU preconditioner which is computed as the by-product of block A-biconjugation process. The pivot entries of this block preconditioner are one by one or two by two blocks. The L and U factors of this block preconditioner are computed separately. The block pivot selection of this preconditioner is inherited from one of the block versions of...


Multi-objective and Scalable Heuristic Algorithm for Workflow Task Scheduling in Utility Grids

To use services transparently in a distributed environment, Utility Grids develop a cyber-infrastructure. Quality of Service parameters such as allocation cost and makespan must be taken into account when scheduling workflow application tasks in Utility Grids. Optimizing both of these target parameters is a challenge in a distributed environment and may conflict one an...


A Comparison of 1D and 2D Data Mapping for Sparse LU Factorization with Partial Pivoting

This paper presents a comparative study of two data mapping schemes for parallel sparse LU factorization with partial pivoting on distributed memory machines. Our previous work has developed an approach that incorporates static symbolic factorization, nonsymmetric L/U supernode partitioning, and graph scheduling for this problem with 1D column-block mapping. The 2D mapping is commonly considered mor...


Enhanced Parallel Multicolor Preconditioning Techniques for Linear Systems

When solving a linear system in parallel, a large overhead in using an incomplete LU factorization as a preconditioner may annihilate any gains made from the improved convergence. This overhead is due to the inherently sequential nature of such a preconditioning. Multicoloring of the subdomains assigned to processors is a common remedy for increasing the parallelism of a global ordering. Howeve...



Journal:
  • CoRR

Volume abs/0803.0048

Publication date: 2008